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Abstract 

Large populations may contain numerous simultaneously segregating polymorphisms subject to natural selection. 
Since selection acts on individuals whose fitness depends on many loci, different loci affect each other's dynamics. 
This leads to stochastic fluctuations of allele frequencies above and beyond genetic drift - an effect known as genetic 
draft. Since recombination disrupts associations between alleles, draft is strong when recombination is rare. Here, 
we study a facultatively outcrossing population in a regime where the frequency of out-crossing and recombination, 
r, is small compared to the characteristic scale of fitness differences a. In this regime, fit genotypes expand clonally, 
leading to large fluctuations in the number of recombinant offspring genotypes. The power law tail in the distribution 
of the latter makes it impossible to capture the dynamics of draft by an effective neutral model. Instead, we find that 
the fixation time of a neutral allele increases only slowly with the population size but depends sensitively on the ratio 
r/a. The efficacy of selection is reduced dramatically and alleles behave "quasi-neutrally" even for Ns ^ 1, provided 
that \s\ < Sc, where Sc depends strongly on r/a, but only weakly on population size N. In addition, the anomalous 
fluctuations due to draft change the spectrum of (quasi)-neutral alleles from /(i/) ~ corresponding to drift, to 
^ i'^^. Finally, draft accelerates the rate of two step adaptations through deleterious intermediates. 



Introduction 



The genetic diversity of a population is determined by mutation, selection, recombination and genetic drift, i.e. the 
stochasticity inherent in reproduction. Understanding how genetic diversity depends on these elements of evolutionary 
dynamics is central to population genetics, since it allows to make inferences about the past history and to predict 
how rapidly populations can adapt. 

Population genetic inference focuses on the diversity at putatively neutral sites and assumes that the history of these 
sites is described by the neutral "coalescent" (KiNGMAN 1982). Coalescent theory models the genealogy of asexual 



organisms or non-recombining segments of a genome by positing that lineages merge at random, backwards in time, 
due to common ancestry. Under this assumption, the mean time to the most recent common ancestor, Tc, of the 
extant TV individuals, is 2N generations. The coalescence time scale is very important, since the genetic diversity of 
the population is given by the number of mutations that occur in all lineages descending from the most recent common 
ancestor of the population. Genetic diversity is therefore controlled by Tc and hence, under the assumption of neutral 
evolution, proportional to N. (Coalescent theory has been extended to weak selection (Krone and Neuhauser 
[r997| ) and recombination ( [Griffiths and Marjoram||1996[ [Hudson||1983[ ).) 



However, the prediction that neutral genetic diversity is proportional to N is at odds with observations: population 
sizes of different organisms differ by many orders of magnitude, while genetic variation among organisms is fairly 
constant ( |Lewo ntin 1974) . To resolve this 'paradox of variation', [Maynard Smith and Haigh] ( 1974[ ) suggested 
that selection acting on linked loci might reduce diversity at a neutral locus. Rapid fixation of a novel mutation at a 
linked locus will perturb the allele frequencies. These perturbations can bring alleles to fixation and, more generally, 
reduce the coalescence time and hence the average genetic diversity ( Barton||1998 Gillespie||2001 Kaplan et al. 
[1989'). Such "hitch-hiking" of neutral alleles on linked selected loci will dominate over genetic drift in large populations. 
Since hitch-hiking leads to larger perturbations for more closely linked loci, one expects genetic variation to correlate 
with the recombination rate, as is indeed observed in Drosophila ( Andolfatto and Przeworski|200I Begun and 
Aquadro[[1992[ [Sella et al[[2009[ [Stephan and Mit chell"'1992[) 



A related effect was described earlier by HiLL and RobertsoNi ( 1966 1, who studied the reduction in the fixation 



probability of a novel beneficial mutation because of selection acting at a linked locus. This effect is commonly known 
as Hill- Robertson interference ( Felsenstein|1974 ) . Hitchhiking and Hill-Robertson interference are different aspects 
of the same phenomenon, one focusing on the effects of linked selection on genetic diversity, the other on the efficacy 
of selection. While hitchhiking and Hill-Robertson effects are primarily associated with positive selection for novel 
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alleles, purifying selection against deleterious mutations also affects genetic diversity. The elimination of (neutral) 
alleles linked with deleterious mutations is known as background selection. The lower the recombination rates are, 



the larger is the target for linked deleterious mutations, resulting in stronger background selection (|Charlesworth 
et aL||1993[ [Hudson and Kapla n '1995! |Nordborg et a/.||1996l ) 



Most models used to study Hill-Robertson and hitch-hiking effects between positively selected mutations consider 
only two loci. Deleterious mutations, however, are expected to be much more frequent, and background selection 
models typically consider many mutations with small deleterious effects. A systematic study of the effect of interfer- 
ence between many weakly selected sites in a mutation/selection/drift equilibrium was presented by McVean and 



Charlesworth ( 2000 ) , who used computer simulations of a model of codon bias evolution. They showed that linkage 
dependent interference between a large number of weakly selected sites has substantial effects on genetic diversity, 
fixation probability of mutations and the degree of adaptation measured as the frequency of preferred codons. This 
and subsequent computational studies reinforced the understanding that the Hill-Roberson effect reduces the effec- 
tiveness of selection and made clear that a quantitative un derstanding of Hill- Robertson effects in multi-locus sys tems 
requires an analysis th at goes beyond two locus models (jCOMERON and Kreitman||2002| |Seger et aL||2010|), see 
( [COMERON et aL|[2008l ) for a recent review. 



It is common to interpret the effect of linked selection in terms of increased variance in offspring number. In this 
interpretation, linked selection can be thought of as a stochastic force analogous to genetic drift and is often referred 
to as genetic draft - a term coined by |Gillespi"e| (|2000|) . Following [Hill and Robertson| (|1966[) and [Felsenstein 



(1974), the increased variance in offspring number is often captured by a reduction in the "effective population" size, 
A^e, in a neutral model (which means enhanced drift and accelerated coalescence). It has, however, been noted that 



a rescaled neutral model does not consistently explain all observables ( 


Barton and Etheridge||2004 


Braverman 


et a/.|1995 


Charlesworth et aL|1993 


Fay and Wu|2000| 


McVean and Charlesworth|2000 


|Seger et aL|2010l, 



(EwENS 2004 Karasov et al. 2010). Furthermore, the dependence of A^e on the actual population size and other 



relevant parameters is not understood ( |Gillespie||2000[ |Lynch|[2007} [Wiehe and Steph"an]|1993[ ) 



Here, we provide analytic results on the effect of draft in a stochastic multi-locus evolution model. Instead of a 



mutation/selection equilibrium considered in , McVean and Charlesworth (2000), our focus here is a continuously 
adapting and facultatively sexual population, like HIV in coevolution with the host's immune system. Our model 
and its relation to the biology of HIV are described in more detail below. The scope of the model, however, extends 
beyond HIV and is equally applicable to scenarios where background selection is dominant. Many important and well 
studied organisms such as influenza, yeast and plants are facultatively sexual. Rice, for example, is a partly selfing 
species and strong selection has acted during its domestication ( Caicedo et a/.||2007 ). While dominance effects can 
render the selfing of diploid organisms more complicated than facultatively sexual propagation of haploid organisms 
( [Charlesworth et aL||1991||KELLY and Williamson||2000 ), our analysis should still provides a null model on top 
of which dominance effects can be investigated. 

Using computer simulations of an adapting population, we first demonstrate how quantities such as the coalescence 
time, the fixation probability of beneficial or deleterious mutations, as well as the allele frequency spectra, depend on 
the rate of outcrossing relative to selection. We also show that our in silico observations cannot be described by a 
neutral model with a reduced effective population size. This is because single genotypes can, through clonal expansion, 
give rise to a wildly fluctuating number of recombinant genotypes. The distribution is so broad, that its variance 
diverges, which in turn makes an effectively neutral diffusion limit impossible. To provide an analytic understanding 
of the simulation results, we use a branching process model that allows us to study the stochastic dynamics of 
novel mutations (neutral, beneficial and deleterious) as they spread through the population. We calculate fixation 
probabilities and the typical time to fixation, Tfi^, (and more generally, the probability of attaining n copies after 
time T) for a new mutant allele, making explicit the dependence on the rate of recombination, fitness variance and 
the population size. An important consequence of genetic draft is a qualitatively different frequency spectrum of rare 
alleles, which we also calculate analytically. Finally, we will show that empirical HIV allele frequency spectra are in 
agreement with our theoretical prediction, confirming the relevance of our model to the dynamics of HIV adaptation. 



Model 



Our model is inspired by the intra-patient evolution of HIV. We will first outline briefly the biology of HIV, and then 
describe our computational model and the branching process approximation used to study the phenomena analytically. 
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Intra-Patient evolution of HIV 



After successful infection, the virus proliferates quickly to levels of more than 10^ viral genomes per milliliter blood. 
This acute phase lasts for several weeks, after which the immune system of the host reduces the number of viral 
genomes per milliliter blood to roughly 100 — 10"* . The following chronic infection can last for many years and is 
largely asymptomatic up to the onset of AIDS. However, even in the chronic phase of infection the to tal number of 
viruses produced per day is estimated to be ~ 10^° per ml, while the generation time is roughly two days (Markowitz 
\et a/.||2003 Perelson et al. 1996). The viral population is subject to clearance by cytotoxic T-lymphocyte (CTL) 



mediated cell death and anti-body mediated destruction of virus ( Paul| 2008 1 , whic h puts the virus und er constant 



selection to change its proteins. CTL epitopes are found throughout the HIV genome ( KoRBER et al. 2007 ) and escape 
mutations are often selected for with coefficients of a few percent per generation ( AsQUiTH et a/.||2006 |. Antibodies 
bind primarily to the products of the envelope gene. In response to antibody recognition, numerous escape mutants 
arise in the envelope gene and spread through the population driven by selection coefficients of one to a few percent 
(Neher and Leitner 2010 Williamson 2003). Additional, even stronger, selection pressure is put on the viral 



population by drug treatment, resulting in rapid emergence of resistance during suboptimal therapy (Hedskog et al. 
2010 Larder et aL|[l989) ). Several hundred mutations are implicated in drug resistance ( Rhee et aL||2003| ^ 

HIV carries two copies of its RNA genome (10'' bps), from which one cDNA is produced and integrated into the 
host cell genome. Recombination in HIV occurs through frequent template switching of the reverse transcriptase in 
the process of cDNA (Levy et al. 2004) production. In other words, the two genomes of the "diploid" virion are 



combined to produce a "haploid" cDNA, from which all the viral proteins and the next generations genomes are 
produced. The rate of recombination is limited by co-infection of host cells with several viruses, which is necessary 
to produce a heterozygo us virus (see Figure [TK ). Estimates suggest an effective recombination rate of « 1.5 x 10~^ 
per base per generation ( Neher and Leitner|[20T0 ), corresponding to a co-infection rate < 10%. Viruses therefore 
undergo clonal amplification most of the time, while different parts of the genome are only weakly linked in the event 
of "outcrossing" (heterozygote formation, followed by template switching). 



Computational model 

Based on the discussion above our model must include the following elements: a large population, selection at many 
polymorphic loci, a constant supply of beneficial mutations, and facultative mating with substantial reassortment in 



case of outcrossing. Models combining all of these elements have already been established in (Neher et al. 2010 



ROUZINE and Coffin 2005 2010 1 and a similar model will be used here. In our simulation a (nearly) constant 
variance of fitness in the population, ct^, is maintained by a constant supply of beneficial mutations. This results in a 
continuously adapting population, with a rate of adaptation given approximately by the variance in fitness. Of course, 
one does not expect constant selection to persist indefinitely (see ( Mustonen and Lassig||2010 ) for a discussion of 



evolution in changing environments): steady conditions on the time scale of fixation of a single allele would suffice for 
the validity of our analysis. 

To simulate population dynamics with the ingredients described above, we implemented a discrete time (Fisher- 
Wright) dynamics of individuals with haploid genomes and a large number of loci which additively define the (log- 
)fitness X of an individual. Each individual contributes a Poisson distributed number of gametes to the next generation 
with mean exp(a; — x — a), where x is the current mean fitness and a is adjusted to keep the population size 
approximately constant at iV. To implement facultative mating, a fraction r of the gametes are paired at random, the 
alleles at all of their loci reassorted, and the two resulting recombinant offspring are placed into the next generation. 
The remaining 1 — r fraction of gametes is placed unchanged into the next generation. New beneficial mutations with 
effect size sq (Nsq 3> 1) are introduced into a randomly chosen individual at rate NUt 3> 1. After equilibration, 
the fitness distribution in the population is approximately Gaussian with a nearly constant variance and steadily 
increasing mean fitness. Into this adapting population wave, we introduce additional mutations, neutral, deleterious 
or beneficial, and track their dynamics. In particular, we record after what time these mutations reach certain 
frequency thresholds and the fitness of genotypes on which successful mutations originate. We also measure the allele 
frequency spectra and the cumulative number of individuals that carried a particular mutation before it goes extinct. 
A further, and more detailed, description of the implementation is given in Appendix [Bj 

Our focus here is on rapidly adapting populations with many concurrent sweeps responsible for the fitness variance 
(T~. HIV, as any other organism, also suffers from deleterious mutations which vastly outnumber the beneficial 
mutations. Deleterious mutations also contribute to fitness variation and increase cr^. For the present investigation, 
it docs not matter whether fitness variation is due to sweeping beneficial mutations, deleterious mutations subject to 
purifying selection, or a combination of the two. The two extreme scenarios are illustrated in Fig. [Tj3. We shall see 
that the fate of novel mutations depends on how the fitness of clones changes with respect to the mean fitness of the 
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population. It is irrelevant whether this fitness difference decreases because the mean fitness increases, or because the 
fitness of the clone decreases due to deleterious mutations. 
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FIG. 1 Panel A illustrates the process of recombination in HIV. Diversifying recombination occurs only in "heterozygote" 
virions, which require multiple infections of the host cell in the previous generation. This is rare, leading to an effectively 
facultative sexual population. Panel B illustrates the dynamics of the fitness distribution. Selection and mutation are shown 
as separate steps, for simplicity. Selection, acting between the left and center panel, results in a shift of the fitness distribution 
towards higher fitness. In a deleterious mutation/selection balance, this increase in fitness is compensated by deleterious 
mutations occurring during the step leading from the center to the right panel. If deleterious mutations are not the dominant 
force, but continuous adaptation to a coevolving host is modeled, the adaptation of the host counter balances the fitness advance 
of the population. 



Mathematical model 



To understand the results of the computer simulations, we analyze analytically an idealized model, which describes 
the propagation of a new allele (under selection, drift and occasional recombination onto a different genetic back- 
ground) within a population with a fitness distribution approximated by a Gaussian traveling wave. This is justified 
if many beneficial mutations are sweeping through the population at the same time (which requires NUb 3> 1 and 
is true for HIV) and is supported by our numerical simulations. Gaussian fitness distributions are expected in large 
asexual populations (IDesai and Fisher||2007I IRouzine et aL||2003l ITsimring et aL||1996|) and in the context of 



background selection (the deterministic equilibrium distribution in case of multiplicative selection is Poisson ( Haigh 
1978), which is well approximated by a Gaussian in the limit of interest). Recombination stabilizes the Gaussian 



form if outcrossing rates are much larger than the inverse of the coalescence times (^Cohen et al. 



2005 



RouziNE and 



COFFIN||2005| , see Fig. SI in the supplement. Selection moves this Gaussian distribution towards higher fitness at a 



rate v equal to the variance in fitness (Fisher's "Fundamental Theorem" of natural selection) 

Branching process approximation. The fate of a novel allele is decided mainly at early times when the allele is rare 
and the finite population size constraint can be neglected. In this case the dynamics of the novel allele is governed by 
a stochastic branching process which in addition to the birth/death events also accounts for the recombination process 
which transfers the novel allele to different genetic backgrounds with different fitness x' ( BARTON|1995a Neher et al. 
[2010) . Within the branching process approximation the probability of finding n copies of the allele (with effect s on 
fitness) anywhere in the population at time T, given that there were k copies on background genotypes with fitness 
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X at time t, obeys the equation (see Supplement) 

-dtp{n, T\k, t,x)=- k{B + D + r)p{n, T\k, t, x) + kBp{n, T\k + 1, t, x) + kDp{n, T\k - 1, i, x) 
+ 7-/;;^ j dx'Kxx'p{n-rL ,T\k-l,t,x)p{n\T\l,t,x') , 



(1) 



where B = 1 + a; — x{t) + s is the birth rate, D — 1 the rate at which individuals die (used to set the unit of 
time), and r is the outcrossing rate. The first term describes the probability flux out of state {k,t). The second term 
corresponds to a birth, which happens with rate kB and in which case the final state {n, T) is reached with probability 
p{n,T\k + l^t,x). Analogously after a death, the probability of reaching [n^T) is p{n,T\k — l,i,x). The last term 
describes the process of outcrossing with the offspring fitness distribution parameterized by a recombination function 
Kxx' , which depends on the details of the model. 

In a model of outcrossing where offsprings are produced by mating two individuals at random and re-assorting all 
of their genes, the recombination kernel K^x' , after averaging over the mating partner, is given by 



2(^'- 



0! 



(2) 



In this model, known as the infinitesimal model, the fitness distribution of the offspring fitness is centered around half 



IS mean fitness x{t) 


Barton 


Neher et al. 


2010 


Y (Note 



2009 BULMER 1980) (the model was referred to as the full recombination model in (Neher et al. 2010 )T 
that the variance of offspring fitness relative to parental mean is cr^/2. The variance of 3cr^/4 in Eq. ^ is the 
result of averaging over the mate.) We assume that individuals are polygamous, i.e. each offspring is the result of an 
independent mating event. 

For most of the analytic calculations, we employ an even simpler recombination model, where offspring fitness is 
simply a random sample from the population, independent of the parents. 



1 



V27rcr2 



(3) 



In this communal recombination model new offspring are assembled from the genetic diversity in the entire population, 
i.e. the novel genotype is drawn from a product distribution with the current allele frequencies in the population 

Note, however, that this model does not lead to instantaneous elimination 



( Barton 


2009 




Neher et al. 


2010) 


of correlations between loci (linkage 



generation. The infinitesimal and communal model have been shown to yield very similar results in Neher et al.\ 
(2010). We will confirm this again by simulation of both models. Even though this model is a drastic idealization, it 



might resemble rather closely the recombination process in HIV. Since recombination in HIV depends on coinfection 
of T-cells with several viruses, most recombination occurs in centers with high concentrations of virus ( JuNG et al. 



2002). In these centers, lineages might undergo several successive outcrossing events, effectively producing a cloud of 



offspring that contain genetic material from a large number of parents. 



Results 

Simulation results 



Figure [2] shows simulation results for the fixation time of neutral mutations and the fixation probability of beneficial 
and deleterious mutations for populations of different size and different outcrossing rates. At low r/tr, fixation times 
(Panel A) are reduced by orders of magnitude below the neutral expectation of 2A^ and reach the latter only for r/a is 



well above one, an effect already observed in Charlesworth et al. ( 1992 ) . The scaling of the data in panel A reveals 
that fixation times are not proportional to the population size at low r/a, in which case the curves would lie on top 
of each other. Genetic draft has an equally dramatic effect on the efficacy of selection, which is shown in Fig. ^p. 
The fixation probabilities of beneficial and deleterious mutations are only slightly perturbed from iV~^ if r/cr < 1, 
even though \Ns\ ranges up to 80. The fact that genetic draft reduces fixation times and the efficacy of selection is 
of course well known and it is customary to describe this effect by a neutral model with a reduced population size, 
i.e. treating draft as if it were just an enhancement of genetic drift. This, however, is often not appropriate, as draft 
causes fiuctuations of a very different nature from that of genetic drift. To illustrate the problem, we define 2A"e as the 
fixation time of neutral mutations Tfix . With this definition, a mutation with effect size s should fix with probability 
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jy^-^ ""^-"tj..^^ ■ However, the observed fixation probability does not follow this expectation if r < cr, as shown in Fig. 3 J, 
(note the logarithmic scale). 

Another striking feature where a neutral model with a reduced effective population size fails to capture the effects 
of draft is the allele frequency spectrum, shown in Fig. for various ratios of r/a. For small r/cr, the frequency 
spectrum /(z/) of neutral alleles falls off much more rapidly than the prediction of the diffusion theory: it decays as 
with frequency instead of This effect is often referred to as "excess of rare variants" (Braverman et al.\ 



1995), but would, in fact, more aptly be called a "lack of common variants". Very similar dependencies of the fixation 



time, fixation probabilities, and allele frequency spectra on r/a are observed if fitness variation is not due to multiple 
selective sweeps but due to purifying selection. Simulation results for such a background selection scenario are shown 
in the supplementary figures S2 and S3. 




r/a r/a 

FIG. 2 Genetic draft dramatically reduces fixation times and the efficacy of selection. Panel A) shows mean fixation times Tfi^ 
of neutral mutations, normalized by the populations size N, as a function of r/a for difi'erent N. For r/a < 1, Tfix increases 
only slowly (sublinear) with A^. Panel B) shows the fixation probability of beneficial and deleterious mutations relative to that 
of neutral mutations. Even though \Ns\ > 50, fixation probabilities are close to N^^ at low outcrossing rates. (A'^ = 16000) 



To rationalize the observed behavior we solved analytically a simplified branching process model describing the 
dynamics of alleles in an adapting population; compare Mathematical model. This branching process solution shows 
explicitly how observables depend on population parameters and elucidates the difference between draft and drift. 



Analytic results for the branching process model 

The key to understanding the qualitatively different behavior at low r/a compared to r/a 3> 1 is the fact that 
genotypes with fitness x can expand clonally if x — x — r > 0, i.e. if their growth rate exceeds the rate at which they 
outcross. In contrast to the case of r/cr 3> 1, for r/a < 1 the condition for clonal expansion is fulfilled for a substantial 
fraction of the genotypes in a finite population. A genotype establishes with probability x — r (setting a;(to) = 0) and 
is clonally amplified so that its copy number subsequently grows as 



where T is the time since establishment (Desai and FiSHER 2007 1 . The term = dtx{t) accounts for the 



increasing mean fitness, x{t) = a^T, of the rest of the population, competing with the clone. After establishment, a 
fit genotype therefore gives rise to a clone whose size has a Gaussian profile in time, as illustrated in Fig. |4^. During 
its lifetime, an individual clone spawns an average of ^ = r Jg dtn{t) recombinant genotypes. For x~r ^ a one finds 

^ sa a{x — r)~^e 2<t^ , which increases rapidly with increasing x — r. In the Discussion, we will develop this intuition 
into a simplified model of the genealogy of clones. Fig. |4]3 shows the copy number of the mutant allele on different 
genetic backgrounds as a function of time in the continuous time branching process model. The observed stochastic 
trajectory shows the features embodied in the simplified effective model: An anomalously fit genotype gives rise to 
many recombinant offspring genotypes, resulting in large fiuctuations in the copy number of the mutant allele. 
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FIG. 3 Fixation probabilities and allele frequency spectra are inconsistent with a neutral model with reduced effective population 
size. Panel A) shows the rescaled fixation probability Npfi^ of mutations with effect size s as a function of Tfi^s, where Tfi^ is 
the fixation time of neutral mutations. In a neutral model, Tfix — 2N, hence it seems sensible to define an effective populations 
size Ns = Tfix/2. The fixation probability of a mutation in a neutral model is indicated as black solid line. While the effective 
neutral model describes pjix well for r/a — 6.4, it fails for r/a — 0.6 (note the log scale). Panel B) shows the distribution of 
derived allele frequencies /(u) in a population of A'^ = 32000 individuals for different r/a. For r/a < 1, allele frequency spectra 
fall off much more steeply than the prediction /{u) ~ u'^ of the neutral model. In fact, the distribution falls off more like 
/(u) ~ i^"'^, which is the leading behavior predicted by Eq. (jsj). All curves are normalized to unit area and the two power laws 
are indicated by straight lines. 
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FIG. 4 Clonal expansion results in large fluctuations. Panel A: A clone with high initial fitness can grow to large numbers 
before going extinct. Large clones give rise to many recombinant daughter clones. Panel B shows a corresponding trajectory 
obtained by simulation of the branching process equation, see Model. The mutant allele originated on a background with fitness 
X — x{t) = 2a at time t — 0. Each subclone of the population appears as a vertical line whose horizontal position indicates its 
fitness X. The size of a clone (color coded) rises as long as a; — x{t) — r > and shrinks after the population wave has passed. 
The larger a subclone, the more daughter clones it produces. The mean fitness x{t) as well as x{t) ± 3a are indicated as white 
lines. 
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The dynamics of the distribution p{n,T\x) of a mutant ahele, given that it arose on a genotype with fitness 
X {k = 1, t = 0), can be described by the branching process defined in Eq. ([T]). We will proceed by analyzing the 
fixation probability, the typical time to fixation, and the allele frequency spectra for neutral, beneficial and deleterious 
mutations for the case r/a < 1. In the opposite limit of rapid outcrossing, the effect of selection during the life time 
of a genotype is small and Eq. [ijcan be analyzed pcrturbatively in cr/r. This perturbation analysis was performed 
in ( Neher et aL|[2010 1 . It was shown that the fixation probabilities of beneficial mutations are reduced by a factor 



1 — acr^/r^, where a is numerical factor of order one and depends on the details of the recombination model. This is 



consistent with the classical argument of Robertson ( 1961 1 showing that the cumulative effect of draft is proportional 
to the square of the degree of finkage ( Barton|2009 Santiago and Caballero|1998 ). The analysis and derivation 



of the results is mainly relegated to the appendix: here we present and discuss the results. 

Extinction and survival of neutral mutations 

The fundamental quantity describing the stochastic dynamics of an allele is the distribution, p{n, T), of the number 
of copies that exist T generations after the founding mutation. Note that the mutation could have arisen on genotypes 
with different background fitness x, and p{n,T) is the average over all possible x at time T — 0. Of particular 
importance is the survival probability Ps(T) = 1 — p(0,T). Fig. [5] shows -Ps(r) for a neutral mutation for different 



outcrossing rates in the communal recombination model, obtained by a numerical solution of Eq. (C6). Initially, 
Ps{T) decays as (1 + T)~^, regardless of the outcrossing rate. This is the same behavior seen in homogenous or 
completely neutral populations. Selection on the heterogenous backgrounds starts infiuencing the fate of the allele at 
times T > cr^^, i.e. the reciprocal of the typical selection differentials in the population. After this initial transient, 
the survival probability depends strongly on the relative magnitude of the outcrossing rate and the fitness variance in 



the population, decaying faster for smaller r/a. We show in the Appendix (Eq. (C12| and Appendix A. 2), that the 
time dependence of the survival probability is given by 




Consistent with our previous argument, we find that for r ^ a neutral mutations are unaffected by selection on the 
background, reproducing the familiar result Ps{T) ^ T^^ . However, a strong deviation from this neutral behavior 
occurs as soon as r/a approaches one. In the regime with r < cr, the ratio r/a enters the dynamics of Ps{T) 
as a prefactor of log^ T in the exponent, resulting in faster extinction at lower r. (Yet another regime appears at 
even smaller r for r/a < l/^logiV and is discussed in the Supplement). Eq. ([s]) was derived using the communal 
recombination model. In Fig. [8] we show, via simulation of the branching process, that the infinitesimal model exhibits 
the same asymptotic dependence on parameters. 

Dynamics of surviving alleles and fixation times 

Given that a new allele is not lost after T generations, what is the mean number of its copies? The answer to 
this question is surprisingly simple and will inform us about the scaling of fixation times of neutral alleles. The 
unconditional expected number of copies after time T is obtained by multiplying Eq. [T] by n, summing over n and 
averaging over the distribution of cc at T = 0. One finds the simple expression 



(n(^))=e^^ (6) 

which reduces to (n(T)) = 1 in the neutral case. Here, (...) denotes the average over p(n,T), while the overbar 
denotes the average over the fitness of the founding genotype. Hence, the expected number of copies depends neither 
on recombination, nor on the speed of adaptation. It only depends on the intrinsic fitness effect of the novel allele. 
However, (n(T)) is a product of the survival probability, times the average n conditioned on survival. For neutral 
mutations, the mean n conditioned on survival is therefore the reciprocal of the survival probability, and when Ps{T) 
is of order , a non-extinct allele has reached copy numbers of ~ iV and has essentially fixed. The first passage 
time, Tfp, to allele frequency ^, i.e. to vN copies, is given asymptotically by 



^ I a^r-^e"^ ^2\ogai.N cr / ^/\ogN <r <a ,_, 
^^^'^UiV r»a 
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Time T [generations] Time T [generatims] 

FIG. 5 Survival of novel mutations in the branching process model. Panel A shows the survival probabilities of neutral mutations 
in the communal recombination model for different outcrossing rates as a function of time (obtained through numerical solution 
for the generating function of the branching process). For large r/a, the survival probability closely follows the result for a 
single neutral locus Ps = (l + T)"^ (note the log-log scale). For r/a < 1, Ps{T) decays much more rapidly as soon as T S> (t~^. 
Panel B illustrates how an intrinsic fitness efi'ect s affects the survival probability for r/a = 0.4 (note the log-linear scale). At 
short times, beneficial and deleterious mutations behave like neutral mutations. After a crossover time, which depends on s, 
the survival probability of deleterious mutations continues to decay exponentially with rate s, while the survival probability of 
beneficial mutations becomes time independent. Ps{t) of neutral mutations (s — 0) decays to zero with a decreasing rate. 



We see that the time to reach allele frequency v scales very differently with A'' in the two regimes. Instead of diffusive 
dynamics with diffusion constant ~ N~^, the rapid turnover of individuals by selection at r < a results in large 
fluctuations, which can propel a mutation rapidly to large numbers. Simulation results confirming the validity of the 
argument that inverse survival probability can be used as a proxy for typical copy numbers, as well as the adequacy 
of modeling allele frequency dynamics by a branching process are shown in Fig. [8] 

A different approach to calculate fixation times in a model of continuous adaptation with facultative out-crossing 
has been developed in [RouziNE and Coffin| ( |2007[ |2010| ) . [RouziNE and Coffin| use a coalescence approach based 
on the clonal structure of the population to calculate fixation times and obtain the same exponential dependence on 
rCT-V21og Nr. 



Allele frequency spectra 



In addition to changing the dependence of the fixation time on the population parameters, draft also results in a 



1995 Fay and 



qualitatively different allele frequency spectrum. This has been noted previously ( Braverman et al. 
[Wu„2q00i |Gillespie||2000[ |McVean and Charlesworth|[20001 ) but to our knowledge never calculated explicitly. 
FigurejSp shows the distribution of derived neutral alleles frequencies v — n /N at different outcrossing rates measured 
in individual-based computer simulations. At high outcrossing rates, we observe f{v) ~ as expected for mutations 
subject to random drift alone. However, when draft is dominant, f{v) decays more steeply as 



ar(T^^^2 log Ni/cr 

v"^ log Nva 



(8) 



This steep decay persists up to a crossover frequency that depends on r/cr and the population size. Beyond this 
crossover frequency, the finite population size constraint becomes important. The calculation of the allele frequency 
spectra is detailed in Appendix A. 4 and the result of the copy number distribution is given in Eq. (A22). 

The ](y) ^ v^^ behavior can be understood heuristically by considering the sizes of clones containing the allele. 
Let's consider the asexual case with r — Q. To calculate the copy number distribution of a novel allele, T generations 
after it originated, we have to average over the initial fitness x of the genotype that it could have arisen on. Given that 
the allele arose on a genetic background with fitness x, its average number after T generations is given by Eq. Q. Since 
we know that the distribution of x is P{x,Q) = (27rCT^)^^/^e^^ Z^"' , we can calculate the expected distribution of n 
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after time T within the deterministic approximation of Eq. mI. One finds that p(n, T) ~ exp(— ). 

The allele frequency spectrum is an average over alleles oidifferent age, T, and the above expression therefore has 
to be integrated over T, which yields f{n) ^ -nT^ . In a facultatively sexual population with r < a, allele frequency 
spectra are also driven by the clonal expansion of individual genotypes. However, in contrast to the asexual case, 
additional clones are constantly seeded through outcrossing so that the novel allele resides in many different clones 
of different ages, similar to the average over asexual clones of different ages. For this reason, both the copy number 
spectrum of individual alleles as well as the frequency spectrum averaged over alleles of different ages have a leading 
behavior p(ri,T) ~ and f{iy) ~ This reasoning is confirmed by the branching process calculation given in 

the Appendix [X2I 

In addition to a much steeper spectrum at low frequencies, the spectra of neutral and deleterious mutations are 

). On a linear chromosome with isolated 
sweeps that affect tightly linked neutral variants, this effect is easy to understand: A tightly linked variant is brought 
close to fixation, resulting in a pile-up of derived variants at frequencies close to one. In our case with many unlinked 
sweeps in a facultatively mating population the effect is less intuitive, but ultimately of similar origin. 



non-monotonic and increase again at frequencies near one ( Fay and Wu|2000 



Selection efficiency and quasi-neutrality 

The rapid and erratic dynamics of alleles incurred by selection on other loci not only affects the time to fixation, but 
also the ability of selection to prune deleterious mutations and fix beneficial mutations. Mutations behave essentially 
neutrally, as long as selection on the intrinsic effect of the allele is outweighed by fiuctuations. A beneficial mutation 
has established when it has risen to sufficient numbers that future extinction is improbable and the allele goes to 
fixation deterministically. Without draft, a beneficial mutation is established when it has reached copy numbers of 
about s~^. With additional fluctuations through draft, however, the allele is more likely to go extinct and has to rise to 
much higher numbers to establish. Similarly, draft can propel deleterious mutations to copy numbers they would never 
reach under drift alone. This reduction of the efficacy of selection is a hallmark of Hill- Robertson interference (Hill 
and Robertson|[1966| and was mainly studied in obligate sexuals ( |Barton||1994[ [McVean and Charlesworth 
2000p . Here, we explore these effects in facultative outcrossers 



Since an allele that has survived longer is likely to be present in more copies, the influence of stochastic fluctuations 
and hence the rate at which the mutant allele goes extinct decreases over time. This is explicit in the time derivative 
of the log survival probability, which in the regime r < a is given by (see Appendix A. 3 1 



cr2 log(r3cr-2y) 



(9) 



The first term s accounts for the deterministic bias due to the intrinsic effect of the allele, which causes accelerated 
extinction if s < and preferential survival if s > 0, comp. Fig.jSja. The second term accounts for the stochastic forces 
on the allele, whose importance decreases with T. The time after which the allele's fate is dominated by selection on 
its intrinsic fitness can be roughly determined by equating the two terms in Eq. ([9]). In the regime of intermediate 
outcrossing rates r < a, we find this crossover time to be T* « a'^ \og{r / s) / s . If fixation of a neutral mutation 
occurs before that time, selection has little effect and the fate of the mutation is dominated by fluctuations until 
fixation. Hence, we find a window of quasi-neutrality for mutations with selection coefficient smaller than Sc 



(10) 



Beneficial mutations with effects larger then 



within which the fixation probability of a mutation is pfix N 

lo r/s — 

Sc have a chance of fixation pfi^ — ^ e , as already found in (Neher et al. 20101. Propagation of 

deleterious mutations with \s\ < Sc is also dominated by fluctuations, resulting in allele frequency spectra similar 
to neutral mutations. Only for \s\ > Sc does the allele frequency spectrum f{iy) decay exponentially as expected 
for deleterious mutations. Fixation of deleterious mutations is exponentially suppressed with pfi^ ^ jv^^e"'*'"^-''" . 
Simulation results, qualitatively showing this behavior, are presented in Fig. [2]3 and[3j 

A related phenomenon is observed in the context of a linear chromosome in the limit where only one sweep affects 



a polymorphism at any given time. Barton ( 1994 1 showed that a beneficial allele that is subject to many sequential, 



weakly linked interfering sweeps has little chance of fixation if its selection coefficient is below a critical value. The 
critical value reported by Barton is Sc ~ c^/i?, where a^/R is the fitness variance of sweeps per map length. In this 
approximation, the frequency of the focal allele is repeatedly reduced by a factor that depends on the degree of linkage 
and the strength of the interfering sweeps: this results in an effective reduction in the growth rate of the frequency 
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of the focal allele to s — Sc- In the case of numerous concurrent sweeps that we consider, the situation is different 
because the focal allele is spread over many different backgrounds subject to selection. The older the allele, the more 
likely it is to be spread over many backgrounds, attenuating the effect of interference. Hence we find an additional 
dependence of Sc on the typical fixation time scale and the extent of recombination that occurs during this time. 



Stochastic tunneling and complex adaptations 



Consider an adaptation process where two mutations at the same locus are needed to confer a fitness advantage, 
but either mutation in isolation is neutral or deleterious. The rate of this process is given by the probability per 
unit time that the second mutation occurs in an individual already carrying the first mutation. More generally, the 
probability that the double mutant genotype never existed up to a time T is 



P(T) 



-/J /g^ dt{7ii{t)+n2{t)) 



(11) 



where fi is the mutation rate and ni{t) is the number of individuals carrying mutation i. The deleterious single 
mutants form a transient subpopulation in the large background population, to which we refer as a mutant "bubble" . 
The important quantity determining the rate of such secondary events is the distribution of the cumulative number 
of single mutants w{T) = Jg dtn{t), i.e. the integrated bubble size. For homogeneous background populations, where 
genetic drift dominates the dynamics of neutral alleles, this quantity has been calculated by IWASA et al. (20041; 



Lynch (20101; Weissman et aL| (2009 2010). In most cases of interest the rate of secondary events is small, so that 
the probability of tunneling in a single bubble is low. Hence tunneling takes longer than the lifetime of a typical 
bubble and only the long time limit of w{T) is important. In analogy to the results presented above, we calculate the 
moment generating function ^(z) = (e"^"") of the bubble size w, where (. . .) denotes the average over the bubble size 



distribution P{w). The calculation given in Appendix A. 5 yields for the ^(z 



$(z) 



„r(T ^ V-2 log z 



\s\ < re 
\s\ > re" 



" V-21ogz 



'y=2l 



(12) 



The result for strongly deleterious mutations is the same as in homogenous populations ( Weissman et a/.|2009[ ). The 
crossover between the two regimes occurs essentially at the quasi-neutrality threshold, with TV replaced by z~ . The 
generating function ^(z), which is also the Laplace transform of P{w), is of immediate practical relevance, since it is 
exactly the probability that a secondary event with rate z does not occur within a single bubble. From the Laplace 
transform, we calculate the distribution of P{w), which in the draft dominated regime we find to be 



P{w) 



1 

logu; 



(13) 



The latter has to be compared to the result for drift alone: P{w) ^ w^^/"^. Both of these behaviors are clearly seen 
in the simulation results shown in Fig. [6j Draft causes large neutral bubbles to become rarer, but also dramatically 
enhances the probability of large bubbles for deleterious (quasi-neutral) alleles. 



Discussion 



The importance of hitch-hiking and draft as a source of fluctuations was first discussed by |Maynard Smith and 



Haigh| ( [1974 ) as a possible explanation for the apparent insensitivity of genetic variation on the census population 
dubbed the "paradox of variation" (Lewontin 1974). Draft is expected to dominate over genetic drift in 



size. 



very large populations and hence is the factor determining the level of genetic diversity and the efficacy of selection 
(|BARTON|1995b[ |Gillespie|2001[ ) . It was shown to have appreciable effects on genetic diversity in Drosophila (ISellaI 



et aL||20'09 ). Draft is expected to be even more pronounced in facultatively outcrossing organisms, where alleles stay 
associated with each other for longer periods of time. Such organisms include many plants, fungi, nematodes, and 
viruses, including the human pathogens influenza and HIV. 

We have studied draft in continuously adapting facultatively outcrossing populations using direct computer sim- 
ulation of a model inspired by intra-patient evolution of HIV. We explain the simulation results by analyzing a 
simplified branching process model, which is amenable to analytic calculations that elucidate how fixation time, the 
efhciency of selection and allele frequency spectra depend on fundamental parameters of the population dynamics. 
The simplification that makes these analytic results possible is a "large number" limit: Due to the large number 
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FIG. 6 The cumulative distribution of "bubble sizes", i.e. the integrated copy number w — dtn{t), of deleterious (s = 
—0.005) alleles. The simulated bubble size distributions for r/a 1 agree with the predicted behavior P{w) ~ w'^, which 
corresponds to a cumulative distribution ~ u)~^ . For r ^ a the size of bubbles of deleterious alleles is cut off exponentially. 
While small bubbles are rarer at r ^ cr compared to r ^ ct, this relation is reversed for large bubbles when the exponential 
cuts below the power law (TV — 16000) . 



of concurrent sweeps, the distribution of fitness in the population assumes a Gaussian form and translates towards 
higher fitness with constant velocity. If the outcrossing rate, r, is comparable to or smaller than typical fitness differ- 
ences, (7, between individuals in the population, clonal expansion of genotypes results in a heavy-tailed distribution of 
recombinant offspring with dramatic effects. In particular, we find that alleles with selection coefficients smaller than 
\sc\ = cP' \og{r / s) / {r^Tfix) behave quasi-neutrally: Their fate is dominated by fluctuations and is nearly independent 
of the intrinsic allele fitness s. Yet because selection acting on the allele is masked not by random drift, but by draft - 
transient associations with chromosomal "backgrounds" with different fitness - the behavior is quantitatively different 
from true neutrality. The conventional dynamics dominated by genetic drift is realized only in the limit of no linkage 
or complete neutrality (including the background). To emphasize this distinction we refer to the draft dominated 
regime as quasi-neutrality. 

In addition to a reduced efficacy of selection, draft reduces fixation and coalescence times, which instead of growing 
linearly with the population size, increases much more slowly. The probability that a given neutral locus is polymorphic 
with an allele at intermediate frequency is roughly given by the product of the neutral mutation rate, /i, and the 
coalescence time, Tc- Hence we expect the neutral diversity 8 ~ fj,a^r~^e^'^ V2 logaN ^ Details of the expected 
heterozygosity, however, depend on the allele frequency spectrum. The latter is also the most informative quantity 
that can be evaluated from a static population snapshot. We have shown that in our model, allele frequency spectra 
decay as i>~'^e^'^ V^iog Nva ^ which is much steeper than predicted by drift. As a consequence, in contrast to the 
classic neutral theory (EWENS 2004) one should expect the number of polymorphic loci to increase almost linearly 
(rather than logarithmically) with the sample size as one samples the population deeper and deeper. 



Why draft is different from drift 



Draft and hitch-hiking is often accounted for by an effective neutral model with reduced population size ( Felsen 



STEIN 


1974 


tative 


y but 



1974 Hill and Robertson 1966). This is possible if draft does not change the offspring distribution quali- 
y increases its variance. The Central Limit Theorem then guarantees the existence of a meaningful 
diffusion limit for the allele frequency dynamics with the diffusion constant that could, if one were so inclined, be inter- 
preted as the inverse effective population size. This diffusion limit, however, is not possible in our case since the number 
of recombinant genotypes, ^, that descend from a single clone has a very broad distribution with power-law tails and 
diverging variance. To see this let's consider a genotype seeded by recombination at fitness x ^ a above the mean, 
which establishes with probability x — r. After establishment, its copy number n{t) initially grows almost determinis- 
tically, see Eq. ([4|. Over time, the mean fitness increases such that the growth rate eventually turns negative and the 
genotype goes extinct. During its lifetime, the clone produces S,{x) ~ r dt n{t) ^ e^^""^) 1"^" recombinant offspring. 
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Since the initial fitness x of the genotype is drawn from a Gaussian distribution P{x) = e ^ ^^'^ /VSttct^, the mean 
and second moment of the distribution of the number of recombinant offspring are given by (^) = J dxP{x)^{x) = 1 
and (^^) = / dxP{x)S,'^{x), respectively. While the mean exists, ^^{x) ^ ei^^^f/'^^ increases so rapildy with x that the 
second moment and hence the variance is divergent. Of course, the finite population cuts off the offspring distribution 
and strictly speaking prevents the divergence of the variance. However, this cut-off is on the order of the population 
size and is irrelevant for the dynamics of rare alleles and the existence of a diffusion limit. Below this cut-off, one 
finds that the distribution of ^ behaves asymptotically as 



-rcr IogCr(T-i 

PiO ■ (14) 

The implications of this heavy-tailed distribution of genotypes that descend from a single clone are best understood by 
slight abstraction of the model. Assume for a moment a population dynamics where clones are seeded simultaneously 
and their recombinant offspring only start growing after the last clone of the previous round has died, as illustrated 
in Fig. |4^. This corresponds to an effective discrete generation scheme, only that one generation corresponds to 
the rise and fall of a genotype, rather than a single individual. Each clone spawns a Poisson distributed number of 
recombinant genotypes with mean ^, where the ^ are drawn from the heavy tailed distribution P{^). The generating 

function of the distribution of number of recombinant offspring is found to be P{X) ~ e^'-^^^^'-^^'^ ^ 
Starting from a single genotype, we calculate the distribution Pn{m) of the number of genotypes carrying the mutant 
allele after n effective generations, which has the generating function 

P„(A) = l-0o0...0(A) = l-$„(A) , (15) 

where 0(A) = 1 — P(A) and o denotes functional composition, i.e. / o g(^x) — f{g{x)). From this result, we derive a 
difference equation for $„(A) 



$„+l(A) - <I>„(A) = _e — ^V-21og(*„(A)r.-)^^(^) ^ (Ig) 



where it was assumed that <i>„(A) is small. This difference equation is the discrete analog of Eq. (A6), which is 
derived in the appendix for the continuous time model and from which most of our results follow. Thus the dynamics 
of mutations in a rapidly adapting population can be viewed as population genetics of genotypes, rather than of 
individuals, with a power-law tailed offspring distribution with diverging variance. This feature makes the description 
by an effective neutral Fisher- Wright model impossible, since no diffusion limit exists when increments have such 
long-tailed distributions. A similar effect occurs in Gillespie's pseudo-hitch- hiking model ( Gillespie||2000 I , where a 
hitch-hiking event can bring an allele to instantaneous fixation. 

Coalescent processes with broad and skewed offspring distributions are an active field of research, see for example 



( [MOHLE and Sagitov||2001[ |Schweinsberg||2003[ ) . It is known, that broad offspring distributions can result in 
simultaneous mergers of multiple lineages and have dramatic effects on the time to the most recent common ancestor, 
which can increase sublinearly with the population size. Whether and how our interpretation of the adapting popula- 
tion in terms of a coarse grained coalescent corresponds to a known universality class of coalescent models remains to 
be shown. Coalescent models with broad offspring distributions have been applied to diversity data of pacific oyster 



(Eldon and Wakeley||2Q06 ) and our results suggest that they might apply to a larger class phenomena. 



Adaptation vs purifying selection 



Discussions of Hill-Robertson interference typically focus either on the effect of linked beneficial or deleterious alleles: 
The former is referred t o as "hitch-hiking" (Kaplan et a/.||1989[ [Maynard Smith and Haigh||1974[ ), the later as 
"background selection" ( Charlesworth et al. |1993 " NoRDBORG aL||1996 ). Our approach addresses the effect of 
variation in genetic background fitness independent of its origin. Thus, while our analysis was set up in the context 
of continuous adaptation driven by many simultaneous selective sweeps, it is equally applicable to fitness variation 
dominated by deleterious mutations. The balance between deleterious mutations and selection in large populations 
gives rise to a steady Poisson distribution of fitness ( Charlesworth et al. 1993 Haigh 19781. If the deleterious 



mutation rate Ud is much larger than the selection coefficient sq of mutations, the Poisson distribution is very close 
to Gaussian, as in the case of the adapting populations considered here. While the origin of fitness variation is very 
different in these two cases, the effect on the fate of mutations is similar. In the adaptation scenario, a genotype 
carrying the mutation in questions stays at a certain point along the fitness axis while the mean fitness is increasing. 
When deleterious mutations dominate, the mean fitness is constant and set by a mutation selection balance, while 
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the fitness of asexual offspring of a particular genotype decreases due to accumulating weakly deleterious mutations. 
Since the dynamics is determined by fitness relative to mean fitness, there is little difference between these scenarios 
from the point of view of the focal mutation. We have simulated scenarios where fitness variation is due to purifying 
selection and found that the stochastic dynamics of novel mutations in a purifying selection scenario is very similar to 
the case where fitness variation is due to sweeping mutations. The analogs of Figs. [2] and [3] for background selection 
are shown in the supplementary Figs. S2 and S3. 

In the asexual case, the most relevant quantity for background selection is the size of the least loaded class 
given by Ne~^'^/'^^°\ from which all future individuals descend and within which the dynamics is essentially neutral 
( Charlesworth et a/.|[l993 Haigh|[I978 ). With recombination, beneficial and deleterious mutations can decouple 
from each other and individuals in the bulk of the distribution have a chance of contributing to the future population, 
albeit with a smaller and smaller chance as the distance from the fittest class increases. In the limit of rapid recom- 
bination, this effect can be described by a reduction in the effective population, which has been studied by |HuDSON| 



and Kaplan (19951; Nordborg et al. (1996). Our results imply that background selection changes the population 



genetics more dramatically when the outcrossing rate is small compared to the standard deviation is fitness. 

In any real-world scenario, there will be contributions to fitness variation from beneficial as well as deleterious 
mutations. In fact, in HIV the fitness variation due to deleterious mutations is expected to be substantial: With 
a mutation rate of roughly 0.2 per genome replication (Mansky and Temin 19951 and an average effect s ize of a 
mutation of sq 



-0.01, we have a = ^/Ud\so \ « 0.04. The outcrossing rate is of the same order of magnitude ( Neher 
and Leitner||2010 ), so our results are expected to apply for deleterious mutations alone. 



Sequential versus multiple sweeps 



To illustrate the applicability of isolated versus multiple sweep regimes of selection we compare sweep frequencies 
in Drosophila and in HIV. Genetic divergence between different Drosophila species suggests a rate of amino acid 
substitution of about one every two hundred generations (see (Sella et al. 12009) for review). While these estimates 
come with a rather large statistical and methodological uncertainty, they nevertheless indicate that sweeps are frequent, 
but don't interfere. Each sweep "occupies" a stretch of ^ s/ p nucleotides {p is the recombination rate per nucleotide) 
on a chromosome for ^ generations. Since they come in at a rate of 0.005 per generation spread over the total 
map length, they should on average be far apart: expect on average less than 1% of genome length to be subject to 
draft at any given time. However, sites under weak selection might still cause substantial Hill-Robertson interference 
( [McVean and Charlesworth||2000[ ). 

On the other hand, the multiple sweep regime is likely to be relevant to organisms with a facultative sexual life 
cycle. In particular pathogens like HIV are under constant selection pressure resulting in a high rate of selective 
sweeps. Selection in HIV evolution is best characterized in the pol gene, the target of most anti-retroviral drugs, and 
in the env gene, the target of neutralizing antibodies. On the order of 100 codons are implicated in drug resistance 
|Chen et al. 2004 Rhee et al. 2003) and several such mutations compete and sweep during the e volution of drug 
resistance in a single patient. The env gene frequently builds up a nucleotide diversity of 3-4% (Shankarappa 



effective recombination rate is about 10 per base and generation(LEVY et al. 2004 Neher and Leitner 



et al.\\1999^ , with frequent adaptive amino-acid substitutions ( |Neher and Leitn"er||2010| |Williamson||2003 i. Th e 



20101. 



With typical selection strength of a few percent per generation, the characteristic length of the segment affected by 
the sweep is about Ikb, which is comparable to the size of the evolving genes: hence the dynamics is in the multiple 
sweep regime. While pol and env genes constitu te only a fraction of t he HIV genome, cytotoxic T-lymphocyte (CTL) 
epitopes are found thr oughout the HIV geno me ( KoRBER et aL|2d07) so that even more sweeps associated with CTL 



escapes are expected (Asquith et al. 2006). In a recent study, Hedskog et al. (2010) characterized the evolution 
of the pol gene of HIV in 6 longitudinally sampled patients during anti-retroviral treatment. Using their data, we 
measured the site frequency spectrum of derived alleles, shown in Figure [Tj The allele frequency spectrum is indeed 
much steeper than expected for neutral evolution and compatible with rather than the neutral expectation of 
h'~^. Furthermore, there is little difference between the spectrum of silent and non-synonymous mutations, consistent 
with our notion of quasi-neutrality. Note, however, that steep allele frequency spectra proportional to i/^^ are also 
expected in expanding populations. While the sequences used in Figure [7] are from chronically infected patients, 
changes in drug therapy resulted in shrinking and expanding population sizes and the effects of this expansion and 
draft cannot be disentangled at present. 



In a recent study, RouziNE and Coffin 
to the one used here and in (|Neher et al. 



(2010) present an analysis of adaptation of HIV using a model similar 
2010). Rouzine and Coffin study the problem of selection on standing 



genetic variation and in addition to the speed of adaptation, they compute the rate of coalescence, which in their 
model is controlled by the probability that two individuals in the population are genetically identical, i.e. are part of 
the same clone. In agreement with our result Eq. ill, they find that the coalescence rate is ~ e , where Xmax is 
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the fitness of tlie fittest individuals in tfie population (relative to x). 
in our Eq. Q. 



The quantity Xmax corresponds to ay/2 log aN 



Furthermore, 



RouziNE and Coffin (2010) study how benefical alleles at low frequency can be lost 
due to the lack of recombination. Our work complements RouziNE and Coffin (2010), as we study the stochastic 
dynamics of new alleles arising from mutations, rather than the evolution of the fitness distribution, given standing 
variation. On the other hand, RouziNE and Coffin (2010) account for correlations between loci by allowing the 
distribution of recombinants to be broader than the population distribution ( RouziNE and COFFlNpOOS I- an effect 
absent in our model. In the range of recombination rates where our results apply, the latter deviations are small. 



substitutions, n=352 
silent, n=547 



Derived allele frequency, u 



111" 



FIG. 7 Derived allele frequency spectrum of silent and non-synonymous mutations in the HIV pol gene, aminoacids 180-220 
( [Hedskog et aZ.]|2010 |. The black lines indicates the expectation f{v) ~ u'^ with draft and f{u) ~ u'^ in absence of draft. 
The neutral spectrum ~ fails to capture the steep decay at low frequencies and fits the data only in an intermediate 
frequency range. Note, however, that population expansions can give rise to allele frequency spectra similar to those expected 
under draft. 



Our analytic results are derived using a branching process approximation that assumes that the novel allele is a small 
fraction of the total population and the finiteness of the population does not yet affect its dynamics. The influence 
of the finite population size is apparent in Figure [3]d, causing deviations from the branching process predictions 
at large frequencies. The branching process approach can be complemented by an effective theory for macroscopic 
frequencies, similar in spirit to diffusion theory for random drift. This theory, however, has to account for the 
very broad distribution of clone sizes. Throughout the manuscript, we have assumed that the speed of adaptation is 
identical to the variance in fitness, i.e. we have assumed that effects on fitness of different mutations are purely additive. 
If interactions between mutations contribute to fitness, the variance in fitness tends to be larger than the speed of 
adaptation, since coadapted combinations are destroyed by recombination. Including genetic interactions within our 
framework involves decoupling v and tr, as well as changing the recombination functions. Genetic interaction might 
increase the importance of draft significantly, as interactions h ave a tendency to result in clon al population structures 
and couple the loci along the genome beyond physical linkage (Neher and Shr aiman|[2009 I. Another simplification 
we have made is to assume that all loci segregate independently in an outcrossing event. Our analysis can be extended 
to account for linear chromosomes by considering a hierarchy of recombination rates for chromosomal segments at 
different distances. These, and other extension are interesting projects, which we leave for future work. 

Several large-scale re-sequencing projects are underway with the goal to characterize genetic diversity in populations 
of humans, Drosophila and Arabidopsis at great depth. This upcoming data will allow us to measure allele frequency 
spectra to much greater depth, similar to the example from HIV shown in Figure [7j and reveal the mechanisms 
shaping genetic diversity in natural populations. 
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Appendix A: Appendix: Derivation of tlie results 



To analyze the dynamics of the probability distribution, it is useful to consider the generating function p(A, T, t, x) - 
X^n A"p(rt, T\l, t, x). From Eq.[Tj one obtains (see Supplement) the following equation for 0(A, T, t, x) = l—p{X, T, t, x) 



dt(t>{X,T,t, x) = r dx' K^j;i(j){X,T,t,x') + {x — vt + s — r)(j){X,T,t, x) — ^^(A,T, i,a;) 



(Al) 



with boundary condition 4>{X,T,T,x) = 1 — A. The survival probability is given by evaluating (j>{X,T,t,x) at A = 
0, while the moments and asymptotic behavior oi p{n,T,t) are determined by the behavior of (j){X,T,t, x) in the 
vicinity of A = 1. We have to solve for (f)(X,T,t, x) in these two limits, which will show up repeatedly below. 
It is convenient to remove the explicit dependence on the initial condition and the velocity v — a'^ hy rescaling 
tp{T,t,x) — (j){X,T,t,x)/{ae) with e = 1 — A. The two limits of interest are now e ^ 1 (A,~ 1) ''^^'^ ^ ~ ^ ^ 



(A ~ 0). After rescaling r — a ^r, s — a ^s, and x = — aT, and t = <j{T — t), Eq. C6 takes the form 



drtpir, x) dx'K^x'ip{T, x') + ix + r + s - f)ip{T, x) - eip'^{T, x) , 



(A2) 



where i/'('''7 x) is the rescaled generating function of the copy number distribution of a mutation that initially occurred 
on genome with rescaled background fitness x- The distribution of rescaled fitn ess x iii the population P(x, t) 
1 



-(x-x(r))V2 



is Gaussian around the moving mean fitness xi''')- Averaging Eq. A2 over P{x, t) yields an equation 



for the scaled generating function ^(t) = J dxP{x>T)'^{'''jX)- 



drHr) = dr / dxP{x,r)ij{T,x) = ^t) 



dxP{T,x)i^Hr,x) 



(A3) 



For beneficial mutations, ^(t) approaches a steady state where the two terms on the right balance each other. This 



long time limit was used as solvability condition in Neher et al. (2010) to determine the fixation probability of 
beneficial mutations. The steady state is independent of the initial condition e and therefore equals the rescaled 
survival probability. However, if s < 0, no such steady state exists, since neutral and deleterious mutations have zero 
chance of fixation in an infinite population. To calculate the probability that a mutation reaches n copies after time 
T, we need to find the full time dependent solution of Eq. |A3[ For simplicity, we will use the communal recombination 
model throughout and show that the infinitesimal model yields similar results by simulations of the latter. For the 
communal recombination model, the recombination contribution in Eq. A2 reduces to f$(T). 

In analogy to Neher et al. (20101, we solve Eq. A2 in the two asym ptot ic regimes of large positive and large 

For 6 < the non-linearity in Eq. A2 can be neglected since V'(''', x) ^ 

6-0^ (r, x)- The two asymptotic solution are: 



negative growth rate 9 = x + t + s — r. 
Conversely, at large 9 the dominant balance in Eq. 



A2 



is 61-0 (t,x) 



^(t, x) 



9/e 9>Qc 



(A4) 



The crossover between the two solutions occurs in narrow region at 9 — Qc{t) where &ciT) is approximately the 
point where the two solutions cross. The evolution of ^{t) is governed by Eq. A3 which has to be self-consisted with 
Eq. IA4[ \yc will first solve for $(t) assuming that r is large. In this case the dominant contribution to the integral 
in Eq. |A4| will come from a well-defined r' ^ which is determined by the maximum of exp(— 0'^/2 -I- log<I'(r')) 
at t' — T — 9. ^(t) will turn out to change slowly and its time derivative can be neglected when determining the 
maximum of the exponent. Hence, 



27rf$(r~ 



e 2 



This solution is valid below Oc(t), where ipir) crosses over the linear saturated form 9/e. 
into Eq.[A3| yields 



dr^ir) = s^{t) ~ f$(T)e 



-fec(r)-fV2 



(A5) 

Substituting this solution 



(A6) 



We will first consider neutral mutations, s = 0, in which case $(r)~^97.<I>(r) = _f:e-''©c(T")-r^/2^ confirming that 
log<I'(r) changes slowly as long as f8c ^ 1 (the limit of f8c <C 1 yields qualitatively different answers). Solving 
Eq. (|A6| for 0c (t) and differentiating with respect to r yields an equation for i9t-8c(t), which is readily solved for 
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0c (r) in case f6c >• 1 



Q^^^^ ^ l + W(f3(r-To)e-i) ^ log(r-3(r-ro) + e^Q°) ^^^^ 



where is the initial condition at r = tq and W{x) is Lambert's W, i.e. the solution of W{x)e^^^^ = x. The 
approximation oiW{x) by a logarithm is only accurate at very large time since there are significant log log x corrections. 
The solution for the rescaled generating function then reads 

$(t) = , ' e ~ e (A8) 

ef 

where the last terms only include the exponential factors. Apart from the rescaling, this solution is independent of 
the initial condition e. The dependence on e will reemerge through the matching to the short time solution. Note 
that $(t) satisfies asymptotically for large t: 

a,<J>(r)«-i^^|^<J>(T) (A9) 

which means that the relaxation rate of decreases with time, but more slowly than it does in the case of pure drift 
(in the absence of draft), in which case d-r log $(t) = — 1/t. 

The long time solution in Eq. |C12| has to be matched to the solution at small r where the initial condition cannot 
be neglected. 

1. Short-time solution 

To analyze the short time behavior, it is useful to split i^{T,x) = i^r{T,x) + ^o{t,x) i'^to a contribution fed 
by recombination i/'r(''', x) ^-^d the contribution originating from the initial condition ■0o(t, x), where -0^(0, x) is 
initially zero and V'oCOjX) = 1- In the communal recombination model, the term describing the production of novel 
recombinants reduces to f$(r) and the two contributions evolve according to: 

drMr. X) = r^r) + (?0, - e(20oV^r + ■ (AlO) 

Recombination enters the equation for V'o only through a reduction of the growth rate and the solution for -00 (t, x) 
is simply 

The integral ^o{t) = / dxP{x^ ''')'4'o{ti x) ^cts as a source for ^r{T, x)- The initial behavior of '^q{t) is quite different 
for e ^ 1 and e ^ 1 and we will analyze these cases separately below. 

2. Survival probability of neutral mutations 

Here, we present the steps necessary to arrive at the result for the survival probability (Eq. [5|. The survival 
probability is given by the generating function of p(n, r), evaluated at A = 0. In our rescaled variables, the sur vival 



probability therefore corresponds to ^(t) with e = a^^ ^ 1. The long time solution for $(t) given in Eq. C12 
contains the free parameters tq and 0o, defined by matching with the short time solution which depends explicitly 
on the initial condition (see above). To establish the matching, we trace the short time solution for e ^ 1 through 
several regimes until the initial condition is forgotten. At small times r <C 1, we have 

<i>(r) « <i>o(r) = (A12) 
er + 1 
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This rapid initial decay corresponds to the early loss of the allele due to neutral processes. Selection starts to matter 
only after r ~ 1 yielding an accelerated decay of ^o{t) 



Mr) = 



1 



2tt 



(A13) 



The Gaussian decay of the survival probability of un-recombined alleles is due to the increasing mean fitness, i.e. the 
adapting population wave moving past the genotype the mutation initially arose on. Hence ^oi^) does not contribute 
to the long time behavior, which is dominated by the part of the solution driven by recombination. We therefore 
have to calculate how much weight is transferred from ipQ to ipr via the term f<i>(T) in Eq. AlO 
the dominant balance for ipriT,x) is r^Q{T) 



In the case e » 1, 
resulting in an initially steady ^r{T) — 7^7. After 



the initial condition has decayed and $0 (r) <C $r (''') the solution V'r (t, x) acquires the form of Eq. A4 with the only 
memory of the initial condition in $r (■'')■ The crossover time is obtain ed b y equating $o(''") with <i>r(r), which yields 
Tg — -^/S log(2/f) — 2f . Using this tq and 80 ~ ^—Alogf (comp. Eq. A4), we can match the short time solution to 
the long time solution in Eq. |C12[ 




W[f'Ti--'-''P-) W{f'Tr^-''P) W[r'r'T,,c-'-''P) 

FIG. 8 Asymptotic behavior of the survival probability for the communal and infinitesimal recombination model. Panel A; 
The survival probability $(r), obtained by numerical solution of Eq.JC6| for the communal recombination model, exhibits the 

predicted asymptotic behavior \J —2f'^ log $ ~ W {f^Te~^~^ (Eq. (l5|). Panel B: The survival probability in the infinitesimal 
model (IM) and the communal model (CM), determined by simulating the branching process, exhibit a similar scaling collapse, 
indicating that the dependence on parameters is similar. Panel C: The first passage time, T/p, to allele copy number vN , 
is well described by the branching process approximation for the dynamics of alleles in a finite population as long as the 
frequency v <^1. This is confirmed here by plotting 2f2 log avN vs. W(f ^GTfpe~^~^ ''^). The collapse of data for different 
f and different v confirms the predicted parameter dependence, Eq. ([7|, while the concordance with the branching process 
results confirms the branching process model. Deviation from the branching process prediction are apparent for v = 0.5. 
Recombination rates in panel C follow the same color code as in panels A&B, while black symbols correspond to f = 1. 



To illustrate the validity of the approximations made in the analysis we solved Eq. |A2| numerically. The left panel 
of Fig. [s] shows the square root of the logarithm of the rescaled survival probability, yj—2f'^ log $(t), plotted as a 

function of W{f^Te~^''^ /^). Data for different outcrossing rates collapse onto the same curve, indicating that the 



expression in Eq. (C12) captures the dependence on f. 

The above analysis assumed the communal recombination model, i.e. a model where the offspring fitness indepen- 
dent of the parental fitness. To demonstrate that the communal recombination model captures the behavior of the 
infinitesimal model, we compare results obtained from stochastic simulations of the branching process for the two 
models in Figure |8] The infinitesimal model exhibits the same scaling collapse, indicating that the dependence on 
the parameters is similar. The two models do however show quantitative differences, which can be interpreted as less 
efficient recombination in the infinitesimal model. This is expected, as recombination in the infinitesimal model only 
halves the correlation with either parent, while this correlation is completely destroyed in the communal recombina- 
tion model. Note, however, that the dependence of Ps on f is strong, so that a rescaling of f substantially changes 
the absolute numbers. 

3. Beneficial and deleterious mutations 



To analyze the behavior of $(t) in the case of s ^ 0, we track back to Eq. A6 focusing on the case where fQc ^ 1. 

a^$(r) = {s- fe-^^-')^{T) (A14) 
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This equation highhghts the fact that as long as s <C fe~''®=, the dynamics of ^{t) is dominated by fluctuations and 
is therefore similar to the neutral case. For a negative s, the decay of ^{t) is accelerated, while a positive s slows 
down and eventually halts the decay of $(r), see Figure [5] The crossover to the behavior dominated by the intrinsic 
fitness effect occurs when the two terms equal. Comparing with Eq. |A9| yields the crossover condition 



^,^k#r) , ^ log(^/g) (A15) 

Before this crossover, the effect of selection on s is a simple attenuation or amplification of $(r) — e*'^^'(r), where ^'(t) 
is approximately the neutral solution, with the slight difference that the crossover point 8c ~ \/— 2 log(ef^'(r)) — 2sr. 
The correction to the crossover point is still small when selection on the intrinsic fitness effect s starts to dominate 
the dynamics of $(t). 



4. Distribution of the allele copy number 



The probability p{n, T) of finding n copies of the allele after time T can be calculated from the generating function 
p(A, T) by contour integration in the complex A-plane: 



p(n,T) = - — (h dX— — — 



(A16) 



The contour C has to encircle the origin and no other singularities of p{X, T). For n > 0, we can evaluate the contour 
integral by deforming the contour to run alongside the singularities away from zero, which are tightly controlled by A^" 
for 1. To do so, we have to study the analytic properties of p{X, T) or equivalently of $(t) — a~^e~^{l —p{X, T)), 
where e = 1 — A. For negative e = 1 — A and s > 0, ipix^''') has a singularity at Qc, the value separating the regime 
of small 6 where the linear equation is valid and the saturated regime where iP{t) — 9/e, comp. Eq. A4 is the 

integral of iP{t,x) over x and therefore has a branch cut for negative real e. Otherwise, $(t) is an analytic function 
of e. We can use this branch cut to evaluate the contour integral in Eq. A16 Substituting ^{t) into Eq. |A16 in 
changing the integration variable to e, we have 



p{n,T) 



2iTi 



de 



e<i>(r) (T 

(1 - e)"+i ^ 2TTi 



^gglog e+ne+log $(r) 



(A17) 



where B is the contour encircling the branch cut. For large n the dominant contributions to the integral come from 
e ^ 1 and we have to find a solution for $(r) for small e. 

To this end, we back-track to Eq. All and integrate the evolution equation for ^(t) for e ^ 1. We find 



<i>o(T) 



r < V-21oge 



= e-V-i^^^-'=V2 ^ > ^-21oge 



(A18) 



For intermediate f, the matching between the short term solution and the long time solution occurs at tq = 8o 
2loge, after which we have 



6 



$(t) = ^e-®<=(^) /2 with Q^{t) « log(f3(r - m) + e'^®") 



(A19) 



For large Gq, Q)c{t) ~ + r^(r — ro)e and the branchcut integral becomes 



5(n,r) 



a 
2TTri 



deSoe" 



-«o/2_eg/2-eof^(r-ro)e-^*3o 



B 



The exponent of the integrant peaks when 



(A20) 



(A21) 



For large n, the dominant balance is between the first term of Eq. (A21) and the remainder, which requires that 



20 



n w e®o/^ and hence e n, ^ . Using Oq = to = \/2 log n, the asymptotic copy number distribution becomes 

pKt) ^^^e ^ . (A22) 

71^2 logn 

The leading dependence of p{n, t) on n is t) ^ n^^. Note, however, that the crossterm from the exponent yields 
another subdominant factor e"^'°s". By contrast, the corresponding expression in the high recombination regime is 

The allele frequency spectrum, f{n), i.e. the probability to find n copies of a derived allele irrespective of its age, 
is given by the time average of p{n, t) over mutations that arose at different times r in the past. It depends on how 
quickly the survival probability decays, as well as on the shape of p(n,r). For large n one can expand the logarithm 



in the exponent of Eq. ( A22) and integrate p(n, r) over r to obtain 



r < 1 



/(n) = / dTp{n,T)r^{ "'i°g« ^ (A23) 
Jo [n-^ r>l 

Even though p{n,T) decays much faster for f 3> 1 (exponential) than for f <C 1 (algebraically), the allele frequency 
spectrum is steeper for f <C 1 than for f 1. The long tail of the latter is due to the contribution of very old alleles 
with very flat exponential p{n, t). For f ^ 1, the clonal amplification and rapid extinction of most alleles give rise to 
steep allele frequency spectra. 



5. The double mutation probability: Stochastic tunneling 

To calculate tunneling probabilities ( IWASA et a/.||2004 WeissMAN et aL||2009 ), we need to know the distribution 

of u> = /J dt'n{t') treated as a stochastic variable, i.e. the size of transient "bubbles" of mutant individuals in a large 
population. In analogy to Eq. [ij the distribution p{w,T\k^t,x) for a mutation being present on background a; in A: 
copies obeys the following equation 

~{dt - kd^)p{w, T|fc, t,x)=- k{B + D + r)p{w, T\k, t, x) + kBp{w, T\k + 1, t, x) + kDp(w, T\k - I, t, x) 

+ rk f dw' I dx' Kxx'Piw — w' ,T\k — l,t, x)p(w' ,T\l,t, x') ^ ^ 



Instead of the generating function, we now consider the Laplace transform p{z,T\k,t,x) ~ ^ dze~^'^p(w^T\k^t,x\ 
Thanks to the convolution property of Laplace transforms and the fact that the fate of the k alleles present at time t 
is independent of each other we have: p(z, Tj/c, x) = p^(z^ T|l, i, a;). As before, we scale variables by and times 
by a. The equation for (/)(t, 2,2;) o'^^(l ~ ^(2, i, x)) then is 

a,(^(r, z, x) = z + f *(r, z) + 00(t, z, x) - (/-'(t, z, x) , (A25) 

where = x + ''" + s + 'Z — Atr = a{T — i) = 0, w = and therefore 0(t, z, x) = 0. If tunneling is rare and happens 
only on time scales longer than the life time of "bubbles" , we do not need to know the full time-dependent solution 
for (/)(t, z, x) but can send r to infinity. The above equation can again be solved by matching the asymptotic solutions 
at large negative and positive Q which in the steady state read 

^(z,x)-|^'^[" + '''(^)] (A26) 

The crossover point 0^ is determined by e 2 \z + f $(z)] — 0c. The "solvability condition" obtained by integrating 
Eq. |A25| w.r.t. Gaussian distribution of x yields: 

= z + s^{z) - f -%e-^'/^<p'{z, x) , (A27) 
J V 27r 
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For small f, the integral on the right hand side is dominated by the vicinity of the crossover point Qc- Combining 
with the matching condition, one finds for ^(z) 

Substituting into the crossover condition we obtain an equation for Qc- 

r — s Oc 

z ^ = Qce~^ (A29) 

re '^^^ — s 

In the neutral case <C fe^^^'), the location of the crossover is at 8c = ■\/— 2 log(ze''®<: / Qc) which implies for $(z) 



e^®- - 1 J zV-21ogz fOc < 1 



*w - - ^— = r, v;r (A3o) 



In case the intrinsic effect of the mutation dominates the denominator of Eq. A29 (|s| ^ re ") one finds 



m = ^Al-e-''^')^\ , l^-l (A31) 

\s\ [|f| rec>i 

The result for <i>(z) given by Eq. |A28 and its different limits is directly relevant to the calculation of the probability 
of secondary events, e.g. an additional mutation as discussed in the earlier in Stochastic tunneling and complex 
adaptations. 

For completeness we also calculate the probability distribution of w = dtn{t) which is given by the inverse 
Laplace transform of $(2;): 

PH = £ l^e^"'^- (A32) 

where C is the contour encircling the branch cut. For example in the intermediate asymptotic rQc ^ 1 regime when 
$(z) = z-y/log we have 

= it ^e^-zv/log^ = ^e-y^y Im[v/2 log + z27r] (A33) 

where we used an approximation valid for w ^ 1. This can be compared to the "draft-less" result , obtained also in 
the f ^ 1 limit: in that case $(z) = \/z, inverse Laplace transform of which gives P{w) ^ w"^^^ (Weissman et al. 
20091). 



Appendix B: Computer simulation methods 

Fisher-Wright simulation. To efficiently simulate the dynamics of large populations, we keep track of classes of 
individuals with identical genomes, which are encoded by bitstrings. The fitness x of all individuals of a class is simply 
the sum of the contributions from all loci. In our discrete generation scheme, a pool of gametes is produced to which 
each class contributes a Poisson distributed number of gametes with mean cexp(— (a; — x + a)). Here, c is the size 
of the class, x is the mean fitness in the population and a = 1 — N/N keeps the population size TV approximately 
constant at N. Fitness defined as growth rate in the continuous time model naturally assumes this exponential form 
in a discrete generation model, simply by integration over one generation. 

To implement facultative mating, a fraction r of the gametes are paired up and from each pair two offspring are 
produced by assigning at random the genes of the parents to either offspring. The remaining 1 — r fraction of gametes 
is copied into the next generation without recombination, i.e. they have gone through an asexual reproduction cycle. 

The genome of each class has a fixed length L = 1024 and new mutations are introduced at a locus whenever this 
locus is monomorphic, i.e. whenever a previous mutation either went extinct or fixed. For each monomorphic locus, 
an individual is chosen at random and the mutation is introduced in this single individual. This scheme of keeping a 
fixed number of loci polymorphic has the advantage of making optimal use of the computational resources, i.e. keep 
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every locus polymorphic while maintaining an overall low mutation rate. The mutation rate, however, becomes a 
dependent quantity which fluctuates around a value that depends on other parameters of the population: NUb is 
simply the average number of loci that become monomorphic in one generation, see (Neher et al. 2010). For large 
L, the number of mutations introduced in each generation is much greater than one and doesn't fluctuate greatly. 

To establish a steadily adapting population, 80% of the loci are kept polymorphic with bcncflcial mutations with 
effect size sq which each generation is rescaled such that the overall fltness variance is cr^ = 0.0025. This rescaling is 
done to be able to specify r /a and s/a explicitly. Since fluctuations of a in large populations with many polymorphic 
loci are small, this does not change the properties of the dynamics and the same results are obtained letting a be 
freely determined by the population. The remaining 20% of the loci are used to study the fate of mutations with 
fixed effect size s. Measurements are performed after an equilibration period of 20000 generations at intervals of one 
hundred generations. The source code is available from the authors on request. 

Simulation of the branching process. The simulation of the population dynamics is complemented by a simulation 
of the branching process which can be directly compared to analytic calculations. We simulate the process described 
by Eq. ([I]) using an event driven algorithm ( Gillespie| 19771. The simulation keeps track of all individuals that 
currently carry the allele in question. For each individual, the time t + Ai of the next event is determined by drawing 
a At from an exponential distribution with parameter B(t) + 13 + r, where B and D are the birth and death rates, 
respectively. At time t + At, a birth, death, or recombination event are performed with probabilities proportional to 
B{t + At), D, and r, respectively. In case of death, the individual is deleted, in case of birth it is duplicated with the 
exact same fitness, in case of recombination, its fitness is redrawn according to the recombination kernels in Eqs. ^ 
or ([3]). (Note that the waiting time distribution for the next event is not exactly exponential, since the birth rate is 
time dependent. This, however, amounts only to a correction of order <^ 1.) For the recombination function of the 
communal model, we need not simulate Eq. ([T]) but can solve numerically for the generating function of the T, t), 
given in Eq. (C6|. To this end, we discretize the fitness x and solve for the vector ipiT,x) using the ODE solver of 
SciPy ( Oliphant||2007 ) . The results obtained through stochastic simulation of the communal model agree with the 
numerical solution for the generating function, as it has to be. 
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Appendix C: Supplement 

Derivation of the backward IVIaster equation for p(n, T) 

The fundamental quantity of the branching process is the probabihty distribution, p(n,T), of observing n copies 
of the allele T generations after it originated. A backward equation for p{n, T) can be derived by considering the 
probability p{n,T\k,t,x) of having n copies at time T, given their were k copies on a background with fitness x at 
time t. p{n,T\k,t — At,x) now can be expressed as a sum over possible intermediate states at time t: 

p{n, T|fc, t - At, x) =p{n, T\k, t, x){l - Atk{2 + x- x{t) + s + r)) 

+ At [k{l + x- x{t) + s)p{n, T\k + 1, t, x) + kp{n, T\k-1, t, x)] 

r (CI) 

+ AtrkY^ / K^:,,p{n-n',T\k-l,t,x)p{n',T\l,t,x') , 

n' 

where x{t) is the mean fitness of the population at time t. The different terms have straightforward interpretations: 
The first term is the probability that nothing happens in the small time interval At, the second accounts for a division 
of one of k individuals, which happens with rate k(l + x — x{t) + s), the third term accounts for the death of one of 
the k, while the last term accounts for outcrossing of one of the k, producing a new individual with fitness x' and 
removing one with fitness x. The outcrossing term is then summed over all possible ways the k — 1 individuals with 
fitness x and the one with fitness x' can give rise to n individuals at time T. Sending At to zero and rearranging 
terms results in an ODE for p{n, T\k, t, x) 

-dtp{n, T\k, t,x)=- p{n, T\k, t, x)k{2 + x- x{t) + s + r) 

+ k{l+x- x{t) + s)p{n, T\k + 1, t, x) + kp{n, T\k - 1, t, x) 

r (C2) 

+ rk^ I K^^,p{n-n',T\k-l,t,x)p{n',T\l,t,x') 

This is equation 1 from the main text. 



Derivation of the equation for the generating function 

To remove the convolution over n it is convenient to consider the generating function p(\,T\k,t,x) = 
^^X^p{n,T\k,t,x). Multiplying the above equation by A" and summing over n yields 

-dtp{\, T\k, t,x) =-k{2 + x- x{t) + s + r)p{\, T\k, t, x) 

+ k{l + x- x(t) + s)p{\, r|fc + 1, t, x) + kp{X, T\k - 1, t, x) 

» {^•^) 

+ rk K^^>p{X,T\k - l,t,x)p{X,T\l,t,x) 

Jx' 

All k initial individuals are independent, hence p{X,T\k,t,x) = p''{X,T\t,x) and the right hand side is 
—dtp'^{X,T\t,x) = —kp''~^ {X,T\t, x)dtp{X,T\t, x). We can therefore divide the equation by p^~^{X,T\t,x) to obtain 

-dtp{X, T,t,x)=-{2 + x- x{t) + s + r)p{X, T,t,x) + {l + x- x{t) + s)f{X, T,t,x) + l+r [ K^^'p{X, T, t, x) 

Jx' 

(C4) 

The generating function has the boundary condition p{X,T\T,x) = X, which follows from p{n,T\k,T,x) = 5nk- 
Substituting p{X,T\t,x) = 1 — <f){X,T,t,x) removes the constant term. 

dt(j){X, T,t,x)=-r I K^^,(I){X, T, t, x') - {x - x{t) + s- r)(j){X, T,t,x) + {l + x- x{t) + s)(j)'^{X, T, t, x) (C5) 

J x' 
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which now has boundary condition (f>{X,T,T,x) = 1 — A. Assuming that selection is weak on the timescale of one 
generation (cr <C 1), we can approximate 1 + x — x(t) + s by 1 and arrive at Eq. 18 of the main text: 

-5t0(A, T,t,x)=r f K.,,,q){X, T, t, x') + {x - x{t) + s - r)0(A, T, t, x) - (t)^{X, T, t, x) (C6) 

J x' 



and 



Solution for ^(r) at low f 

In the main text, we presented a solution to the two equations (Eqs. (21) and (23)) 

dr^r) = S^t) - e J dxP{T, x)^'(r, x) (C8) 

in a regime of intermediate f (l/yToglV < f < 1). An additional solution with qualitatively different properties exists 
at low r. While the assumption of a Gaussian fitness distribution is questionable in this range of f, we nevertheless 
present the solution here for completeness. Proceeding as before, we evaluate the integral in Eq. ( C7 1 by expanding 
exponent around its maximum. The maximum is located at t' = r — — a(T') where a(r) — — $(T)~^9r$(T). 
Assuming Q;(r) changes slowly with time, we have 

iP{t) « V2^f$(T)e^^^ (C9) 

This solution is valid below Oc, where iP{t) crosses over the linear saturated form 9/e. In addition to the matching 
condition at Qc, we use Eq. |C8| for s = to determine a: 

5^$(t) = -a$(T) = -f$(T)e®'^("-*^)+"'/2-*^'/2 (CIO) 

For fQc 3> 1 we recover the solution with small a given in the main text. For smaller f, however, we find a^r) w 

Solving the matching condition for Qc and differentiating with respect to r yields drQc = ^-^ct{T). This is readily 
solved for this case of < 1 

ee(T) = (-3(t - To) log f + 93)1/3 . (Cll) 

Substituting this solution for 6c (t) into the expression for the rescaled generating function yields 

, Oc+c)^ [-3(.-.o)lo«f+eg]^/^ 

$(t) = -4e 2 — ^ e 5 (C12) 

er 

Hence in this low r regime, the decay of the survival probability is qualitatively different from the regime of intermediate 
f . 



Effective clone-based model 

To rationalize the behavior of the continuous time branching process, we considered the following simplified model 
discussed in the main text: Genotypes expand clonally and produce recombinant offspring. The offspring start growing 
simultaneously in the next "effective" generation after all clones from the previous generation have disappeared. The 
relevant quantity now is the number of clones or distinct genotypes, rather than the number of individuals. To 
understand the dynamics of the number of clones, we need to know how many clones a single clone can produce. 

Consider a single genotype with background fitness x which is carrying a mutation of effect size s. The expected 
number of recombinant offspring from this genotype is ^ = f dtn^{t), where n^{t) is the copy number trajectory. 
The Laplace transform p(^) of p{^) obeys the equation {(f>{z) = 1 — p{z)) 



9x<l>{x, zj^fz + ix + s-fil- z))4){x, z) - 0(x, z)'^ , 



(C13) 
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which is a simpler version of the Eq. (42) (main text) since only one single clone is considered and recombination to 
daugther clone is ignored. This equation can be solved asymptotically in the regimes of large and small x ~ ^■ 



(x-r(i-z)) x-r:$>ec 



(C14) 



where Qc ~ ^/—2]ogrz. The initial fitness of the genotype, x, is Gaussian distributed and the Laplace transform has 
to be averaged over 



dx 

, ( 



(C15) 



Let's look at the mean number (^) of recombinant offspring first, which is given by differentiating with respect to z 
and setting z = 0, which in turn sends to infinity. Integration by parts yields 



dx 



dx 

— ^=f 
x/2^ 



(C16) 



Hence the mean number of recombinant offspring is I for a neutral mutation and approximately 1 + s/r for mutations 
with small effect. To evaluate the integral in Eq. (CI5 1 at finite z, we have to account for the cross-over of 0(x, z) at 



X — f — Qc -\/— 2 log fz. For small z, the cross-over translates into a cut-off of the integral. Again, the integral can 
be evaluated by parts: 



^ dx _y(f(l_^)_3-)+f^(l-^)V2 



oo 

rz 
f — s 



V2n 

1 _ g-ec(f-s)-fV2' 



X— r(l — z) + s 



dxe-^"/^ + 



dx 

— ^=t 
2tt 



-XV2(^„^(1_^)) 



(CI7) 



Hence, the generating function of the average number of recombinant offspring generated by a random genotype is 
given by 



p{z) = I ~ (/)(z) « I - ^ 



1 _ e-e.(f-5) 



(C18) 



where the last expression is valid for z 1 and f ^ 1. 

The actual number m of recombinant offspring (novel clones) generated by one clone is Poisson distributed with 
mean ^. The generating function of m is therefore 

^A™P(m)=^A'" / d^p(Oe-«^=/ d^ piOe'^^'-'^ ^ p{l - X) ^ I - cP{l - X) (C19) 

We will now use this result to calculate how the number of clones that descend from a particular genotype evolves 
over time. 



The stochastic dynamics of the number genotypes 

As long as the clones we are tracking constitute a small fraction of the population, different clones are independent. 
The probability to go from k to m clones in one effective generation has therefore the generating function P{X, k) ~ 
P'^(A, I) = (f — (f){l — X))^ . To study the dynamics of the number of clones over many generations, we need to know 
how this propagator behaves when iterated. 



m mi,...7nn 



mi ,...mn-i 



[P{1-P(1-P{1-...P{1-X)))) 



(C20) 



= [1 - O (/) . 



(I~A)]'^- = [I-<i>„(I-A)]'= 
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Using the result for the Laplace transform in Eq. (C18), we arrive at the difference equation 



-f V— 2 log r*,! 



(C21) 



This is exactly the differential equation we have derived using the continuous time mode when the effective generation 
time is set to f^^. The latter is reasonable since the is the turnover time by recombination. 



Supplementary Figure: Population distribution 




FIG. 9 The fitness distribution of a population adapting at many loci is Gaussian. The figure shows the fitness distribution 
measured using our computational model for different ratios r/cr. Only at r/a = 0.1 or 0.2, one sees (slight) deviations from 
the Gaussian. Our analysis appUes in the range 1 > r/cr > \/y/\ogN , where the fitness distribution is Gaussian all the way 
into the tails. 



Supplementary Figures: Background Selection 
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FIG. 10 Fixation times and fixation probabilities with background selection, in analogy to Fig. 2 of the manuscript. The left 
panel shows the mean fixation time of neutral mutations in populations of different sizes at different ratios of r/cr. Fixation times 
are normalized by A''. In contrast to Fig. 2 of the manuscript, the fitness variation here does not result from sweeping beneficial 
mutations but from many deleterious mutations. Comparing Fig. 2 to this figure, one sees that the effect of background selection 
on the fixation time of neutral mutations is very similar to that of multiple sweeps in a facultatively sexual population. The 
right panel shows the fixation probability of mutations with different effects on fitness for different ratios r/cr, normalized to 
the neutral expectation A''^^. Again, the fitness variance is due to background selection rather than sweeps, but the effect on 
the fixation probability is similar. These observations are consistent with the argument that from the perspective of a novel 
mutation, the nature of the fitness variation is irrelevant. What matters is the dynamics of the fitness of individual genotypes 
relative to the mean: For background selection, the fitness of genotypes declines due to accumulation of deleterious mutations, 
while in the case of adaptation the mean increases steadily. 




FIG. 11 Allele frequency spectra of neutral (left) and deleterious mutations (right) in a background selection scenario, in 
analogy to Fig. 3b of the manuscript. At low recombination rates (r/cr < 1), the frequency spectrum of neutral mutations 
falls off much more rapidly than expected in a neutral model, very similar to what is observed for the scenario of continuous 
adaptation. The predicted behavior ~ is indicated by the steeper black line. Only when r/cr ^ 1 does the spectrum agree 
with the neutral prediction (~ indicated by upper straight black line). The right panel shows the frequency spectrum of 
the deleterious mutations responsible for the fitness variation with effect size so = —0.004. At low recombination rates, allele 
frequencies are close to fixation, either in the bad (y = 1) or the good (y = 0) state. At high recombination rates, the allele 
frequency spectra are distributed around their equilibrium value v = fj,/ so = 0.0625. 



